pyrE gene sequence

SEQ ID NO: 1

Pyre/niger Length: 1578 March 9, 2001 09:28 Type: N Check: 2282 



1 GGGTTAATGT GAAGGCGTTA GTGGTAATGT ATATTAATGG TGAGATGGGC
51 TTTGATTGGG TTTAATTGGA ATCTGTATAT TTTCAGATGG AGTCAACTTT
101 TGAATGGCCA ATATATCCTC GGCGATACCG TCGGAGATAA GATAAGAATA
151 ATCGCACACT ATTCCCAAAG CATACTGGTA CATACTGCAT TCGGCTAGTG
201 CGGGGTGCTT ACCTCATCCA CCCGAATGAG CCCAACTTTT TTGTCTCAAT
251 CAATAATTGC ATCCAAATTC CCCCGCAACT TCCCCCTCCA ACCCCGTGTC
301 TATACCACTC CCTCCACACC CACACAATCA CAATGGCTCT CCCTGCCTAC
351 AAGACCGCCT TCCTGGAGTC TCTCGTCGGC CAACGTGCTG ACTTTCGGCA
401 CCTTCACCCT GAAGTCGGGT CGCCGTGCGT CACCCCTCCA ACACCGGCAT
451 TATCGCAATC GGAAGACTTA CCACTGTATA CAGACTCCCC CTACTTCTTC
501 AACGCCGGCA TCTTCAACAC CGCCTCTCTC CTCTCCGCCC TCTCCACCAT
551 GGCCCACACC ATCATCACCT TCCTCGCTGA GAACCCTTCC ATCCCCAAGC
601 CCGACGTCAT GCTTCGGGTA AAAAACCCCC TCTTTCCCCA ATACCCCACT
651 TCCACTCAAC AACCCATAAA TAACTAACAA AAACCCCCTA AACAGCCCCG
701 CATACAAAGG CATCCCCCTC GCGTGCGCCA CCCTCCTTGA ACTCAACCGC
751 ATCGACCCCG CCACCTGGGG CAGCGTGTCC TACAGCTACA ACCGCAAAGA
801 AGCCAAGGAT CACGGCGAAG GCGGCAACAT TGTCGGCGCC GCTCTGAAGG
851 GCAAGACCGT GCTTGTGATC GACGATGTCA TCACGGCCGG TACCGCCATG
901 CGTGAGACCC TCAACCTGGT CGCCAAGGAG GGCGGCAAGG TCGTCGGATT
951 CACTGTTGCT CTGGACCGCT TGGAGAAGAT GCCCGGACCC AAGGACGAGA
1001 ACGGTGTCGA GGACGATAAG CCCAGAATGA GTGCTATGGG TCAGATCCGT
1051 AAGGAGTATG GTGTGCCCAC GACGAGTATT GTTACTCTGG ATGATTTGAT
1101 CAAGTTGATG CAGGCGAAGG GCAATGAGGC CGATATGAAG CGGTTGGAGG
1151 AGTATAGGGC TAAGTATCAG GCTAGTGATT AGTCGGTTTC ATTGACCGAT

FIG. 15A 



1201 TGTTTGGGTG GGTGTGAGAG GTTAGGTTAG GTTGTGGGCG TAGGAATGAA
1251 AAGCTGTATA CATAGGGGCC TGAAGAGGTG CGTAGAGACG GTCGTGAGAT
1301 GTTTTATGTC AAAATCTTGA ACAAATGACA CCTTAAAAAA GACCCCTTGG
1351 TTTCAGCTGA ATTAGCCCGG AAAGATGCTC GGCACGCCAT GAGTCTAGCC
1401 CACTCAGTGG GCACCCGTTT CCCACATTTG AAGTGGCCGA CGCTTATTTG
1451 GCTGAGGCTG TGGCCTGGAA AGGCACTATG GCGTGCTGCG GTACAAGGCC
1501 GGGGCTGGCG TACGAACCAC GACGCCCGAA GGGAACTCTT CGGTCTTACT
1551 ACTACTATGT CCCCAGTTGA CCCCCCGA
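
A listing in this layout (a 1-based position followed by blocks of ten bases) can be joined into one contiguous string and checked against the stated length of 1578. The following is a minimal sketch, not part of the original listing; the function name `parse_listing` and its error messages are hypothetical, and it assumes every data row starts with an unbroken integer position.

```python
import re


def parse_listing(text):
    """Join the base blocks of a numbered sequence listing into one string.

    Assumption: each data row starts with a 1-based position followed by
    whitespace-separated blocks of bases. Other lines are skipped.
    """
    parts = []
    expected = None  # position the next row should start at
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # blank lines and captions such as "FIG. 15A"
        start = int(fields[0])
        bases = "".join(fields[1:]).upper()
        if not re.fullmatch(r"[ACGT]*", bases):
            raise ValueError(f"unexpected character in row {start}")
        if expected is not None and start != expected:
            raise ValueError(f"row {start} does not follow the previous row")
        parts.append(bases)
        expected = start + len(bases)
    return "".join(parts)


rows = """
1 GGGTTAATGT GAAGGCGTTA GTGGTAATGT ATATTAATGG TGAGATGGGC
51 TTTGATTGGG TTTAATTGGA ATCTGTATAT TTTCAGATGG AGTCAACTTT
"""
sequence = parse_listing(rows)
print(len(sequence))
```

Run on all rows of the listing, the returned string should have 1578 characters if the rows are intact. A row whose position number was split by extraction (for example `12 01`) raises an error instead of being silently accepted.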



SEQ ID NO: 2 

Translation of pyrE (1-1578)
Universal code

1 GGGTTAATGTGAAGGCGTTAGTGGTAATGTATATTAATGGTGAGATGGGCTTTGATTGGG
CCCAATTACACTTCCGCAATCACCATTACATATAATTACCACTCTACCCGAAACTAACCC

1 GLM*RR*W*CILMVRWALIG
1 G*CEGVSGNVY*W*DGL*LG
1 VNVKALVVMYINGEMGFDWV



61 TTTAATTGGAATCTGTATATTTTCAGATGGAGTCAACTTTTGAATGGCCAATATATCCTC
AAATTAACCTTAGACATATAAAAGTCTACCTCAGTTGAAAACTTACCGGTTATATAGGAG

21 FNWNLYIFRWSQLLNGQYIL
21 LIGICIFSDGVNF*MANISS
21 *LESVYFQMESTFEWPIYPR



121 GGCGATACCGTCGGAGATAAGATAAGAATAATCGCACACTATTCCCAAAGCATACTGGTA
CCGCTATGGCAGCCTCTATTCTATTCTTATTAGCGTGTGATAAGGGTTTCGTATGACCAT

41 GDTVGDKIRIIAHYSQSILV
41 AIPSEIR*E*SHTIPKAYWY
41 RYRRR*DKNNRTLFPKHTGT

181 CATACTGCATTCGGCTAGTGCGGGGTGCTTACCTCATCCACCCGAATGAGCCCAACTTTT
GTATGACGTAAGCCGATCACGCCCCACGAATGGAGTAGGTGGGCTTACTCGGGTTGAAAA

61 HTAFG*CGVLTSSTRMSPTF
61 ILHSASAGCLPHPPE*AQLF
61 YCIRLVRGAYLIHPNEPNFF



FIG. 15B 



TTGTCTCAATCAATAATTGCATCCAAATTCCCCCGCAACTTCCCCCTCCAACCCCGTGTC 
AACAGAGTTAGTTATTAACGTAGGTTTAAGGGGGCGTTGAAGGGGGAGGTTGGGGCACAG

LSQSIIASKFPRNFPLQPRV
CLNQ*LHPNSPATSPSNPVS 
VSINNCIQIPPQLPPPTPCL 

TATACCACTCCCTCCACACCCACACAATCACAATGGCTCTCCCTGCCTACAAGACCGCCT 
ATATGGTGAGGGAGGTGTGGGTGTGTTAGTGTTACCGAGAGGGACGGATGTTCTGGCGGA 

YTTPSTPTQSQWLSLPTRPP 
IPLPPHPHNHNGSPCLQDRL
YHSLHTHTITMALPAYKTAF 



TCCTGGAGTCTCTCGTCGGCCAACGTGCTGACTTTCGGCACCTTCACCCTGAAGTCGGGT 
AGGACCTCAGAGAGCAGCCGGTTGCACGACTGAAAGCCGTGGAAGTGGGACTTCAGCCCA

SWSLSSANVLTFGTFTLKSG
PGVSRRPTC*LSAPSP*SRV 
LESLVGQRADFRHLHPEVGS 

INTRON I 

CGCCGTGCGTCACCCCTCCAACACCGGCATTATCGCAATCGGAAGACTTACCACTGTATA
GCGGCACGCAGTGGGGAGGTTGTGGCCGTAATAGCGTTAGCCTTCTGAATGGTGACATAT

RRASPLQHRHYRNRKTYHCI
AVRHPSNTGIIAIGRLTTVY
PCVTPPTPALSQSEDLPLYT 



CAGACTCCCCCTACTTCTTCAACGCCGGCATCTTCAACACCGCCTCTCTCCTCTCCGCCC 
GTCTGAGGGGGATGAAGAAGTTGCGGCCGTAGAAGTTGTGGCGGAGAGAGGAGAGGCGGG

QTPPTSSTPASSTPPLSSPP
RLPLLLQRRHLQHRLSPLRP
DSPYFFNAGIFNTASLLSAL

NcoI

TCTCCACCATGGCCCACACCATCATCACCTTCCTCGCTGAGAACCCTTCCATCCCCAAGC 
AGAGGTGGTACCGGGTGTGGTAGTAGTGGAAGGAGCGACTCTTGGGAAGGTAGGGGTTCG

SPPWPTPSSPSSLRTLPSPS 
LHHGPHHHHLPR*EPFHPQA 
STMAHTIITFLAENPSIPKP
INTRON II

CCGACGTCATGCTTCGGGTAAAAAACCCCCTCTTTCCCCAATACCCCACTTCCACTCAAC
GGCTGCAGTACGAAGCCCATTTTTTGGGGGAGAAAGGGGTTATGGGGTGAAGGTGAGTTG

PTSCFG*KTPSFPNTPLPLN
RRHASGKKPPLSPIPHFHST
DVMLRVKNPLFPQYPTSTQQ



FIG. 15C 



AACCCATAAATAACTAACAAAAACCCCCTAAACAGCCCCGCATACAAAGGCATCCCCCTC
TTGGGTATTTATTGATTGTTTTTGGGGGATTTGTCGGGGCGTATGTTTCCGTAGGGGGAG

NP*ITNKNPLNSPAYKGIPL
THK*LTKTP*TAPHTKASPS
PINN*QKPPKQPRIQRHPPR



GCGTGCGCCACCCTCCTTGAACTCAACCGCATCGACCCCGCCACCTGGGGCAGCGTGTCC 
CGCACGCGGTGGGAGGAACTTGAGTTGGCGTAGCTGGGGCGGTGGACCCCGTCGCACAGG 

ACATLLELNRIDPATWGSVS 
RAPPSLNSTASTPPPGAACP
VRHPP*TQPHRPRHLGQRVL



TACAGCTACAACCGCAAAGAAGCCAAGGATCACGGCGAAGGCGGCAACATTGTCGGCGCC 
ATGTCGATGTTGGCGTTTCTTCGGTTCCTAGTGCCGCTTCCGCCGTTGTAACAGCCGCGG 

YSYNRKEAKDHGEGGNIVGA 
TATTAKKPRITAKAATLSAP
QLQPQRSQGSRRRRQHCRRR 

KpnI

GCTCTGAAGGGCAAGACCGTGCTTGTGATCGACGATGTCATCACGGCCGGTACCGCCATG 
CGAGACTTCCCGTTCTGGCACGAACACTAGCTGCTACAGTAGTGCCGGCCATGGCGGTAC

ALKGKTVLVIDDVITAGTAM
L*RARPCL*STMSSRPVPPC
SEGQDRACDRRCHHGRYRHA 



CGTGAGACCCTCAACCTGGTCGCCAAGGAGGGCGGCAAGGTCGTCGGATTCACTGTTGCT 
GCACTCTGGGAGTTGGACCAGCGGTTCCTCCCGCCGTTCCAGCAGCCTAAGTGACAACGA 

RETLNLVAKEGGKVVGFTVA 
VRPSTWSPRRAARSSDSLLL
*DPQPGRQGGRQGRRIHCCS



CTGGACCGCTTGGAGAAGATGCCCGGACCCAAGGACGAGAACGGTGTCGAGGACGATAAG 
GACCTGGCGAACCTCTTCTACGGGCCTGGGTTCCTGCTCTTGCCACAGCTCCTGCTATTC 

LDRLEKMPGPKDENGVEDDK 
WTAWRRCPDPRTRTVSRTIS
GPLGEDARTQGRERCRGR*A



CCCAGAATGAGTGCTATGGGTCAGATCCGTAAGGAGTATGGTGTGCCCACGACGAGTATT
GGGTCTTACTCACGATACCCAGTCTAGGCATTCCTCATACCACACGGGTGCTGCTCATAA 

PRMSAMGQIRKEYGVPTTSI
PE*VLWVRSVRSMVCPRRVL
QNECYGSDP*GVWCAHDEYC



FIG. 15D 



GTTACTCTGGATGATTTGATCAAGTTGATGCAGGCGAAGGGCAATGAGGCCGATATGAAG 
CAATGAGACCTACTAAACTAGTTCAACTACGTCCGCTTCCCGTTACTCCGGCTATACTTC 

VTLDDLIKLMQAKGNEADMK
LLWMI*SS*CRRRAMRPI*S 
YSG*FDQVDAGEGQ*GRYEA 



CGGTTGGAGGAGTATAGGGCTAAGTATCAGGCTAGTGATTAGTCGGTTTCATTGACCGAT 
GCCAACCTCCTCATATCCCGATTCATAGTCCGATCACTAATCAGCCAAAGTAACTGGCTA

RLEEYRAKYQASD*SVSLTD
GWRSIGLSIRLVISRFH*PI 
VGGV*G*VSG**LVGFIDRL 



TGTTTGGGTGGGTGTGAGAGGTTAGGTTAGGTTGTGGGCGTAGGAATGAAAAGCTGTATA 
ACAAACCCACCCACACTCTCCAATCCAATCCAACACCCGCATCCTTACTTTTCGACATAT 

CLGGCERLG*VVGVGMKSCI 
VWVGVRG*VRLWA*E*KAVY 
FGWV*EVRLGCGRRNEKLYT



CATAGGGGCCTGAAGAGGTGCGTAGAGACGGTCGTGAGATGTTTTATGTCAAAATCTTGA 
GTATCCCCGGACTTCTCCACGCATCTCTGCCAGCACTCTACAAAATACAGTTTTAGAACT 

HRGLKRCVETVVRCFMSKS* 
IGA*RGA*RRS*DVLCQNLE
*GPEEVRRDGREMFYVKILN 



ACAAATGACACCTTAAAAAAGACCCCTTGGTTTCAGCTGAATTAGCCCGGAAAGATGCTC 
TGTTTACTGTGGAATTTTTTCTGGGGAACCAAAGTCGACTTAATCGGGCCTTTCTACGAG 

TNDTLKKTPWFQLN*PGKML
QMTP*KRPLGFS*ISPERCS 
K*HLKKDPLVSAELARKDAR 



GGCACGCCATGAGTCTAGCCCACTCAGTGGGCACCCGTTTCCCACATTTGAAGTGGCCGA 
CCGTGCGGTACTCAGATCGGGTGAGTCACCCGTGGGCAAAGGGTGTAAACTTCACCGGCT 

GTP*V*PTQWAPVSHI*SGR
ARHESSPLSGHPFPTFEVAD 
HAMSLAHSVGTRFPHLKWPT 



CGCTTATTTGGCTGAGGCTGTGGCCTGGAAAGGCACTATGGCGTGCTGCGGTACAAGGCC 
GCGAATAAACCGACTCCGACACCGGACCTTTCCGTGATACCGCACGACGCCATGTTCCGG 

RLFG*GCGLERHYGVLRYKA 
AYLAEAVAWKGTMACCGTRP 
LIWLRLWPGKALWRAAVQGR 



FIG. 15E 



1501 GGGGCTGGCGTACGAACCACGACGCCCGAAGGGAACTCTTCGGTCTTACTACTACTATGT
CCCCGACCGCATGCTTGGTGCTGCGGGCTTCCCTTGAGAAGCCAGAATGATGATGATACA

501 GAGVRTTTPEGNSSVLLLLC
501 GLAYEPRRPKGTLRSYYYYV
501 GWRTNHDARRELFGLTTTMS

1561 CCCCAGTTGACCCCCCGA
GGGGTCAACTGGGGGGCT

521 PQLTPR
521 PS*PP
521 PVDPP
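
Each block of the translation above pairs a 60-base row with its complementary strand and the three forward reading frames, with `*` marking a stop codon. The layout can be reproduced with a short script. This is an illustrative sketch only, assuming "Universal code" means the standard genetic code. The names `CODON_TABLE`, `complement` and `translate` are hypothetical and do not come from the original figure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def complement(seq):
    """Return the complementary strand, written left to right (not reversed)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def translate(seq, frame):
    """Translate seq in forward frame 1, 2 or 3; '*' marks a stop codon."""
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# First 60 bases of the listing plus the next two, so that frames 2 and 3
# can complete their last codon as the figure does.
row = "GGGTTAATGTGAAGGCGTTAGTGGTAATGTATATTAATGGTGAGATGGGCTTTGATTGGG" + "TT"
for frame in (1, 2, 3):
    print(frame, translate(row, frame))
print(complement(row[:60]))
```

The complement is printed unreversed because the figure aligns it base for base under the forward strand. A true reverse complement would additionally reverse the string.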